A retrospective study differentiating nontuberculous mycobacterial pulmonary disease from pulmonary tuberculosis on computed tomography using radiomics and machine learning algorithms

Abstract Objective To evaluate the effectiveness of a machine learning approach based on computed tomography (CT) radiomics in distinguishing nontuberculous mycobacterial pulmonary disease (NTM-PD) from pulmonary tuberculosis (PTB). Methods In this retrospective analysis, the medical records of 99 individuals with NTM-PD and 285 individuals with PTB at Zhejiang Chinese and Western Medicine Integrated Hospital were examined. Computer-generated random numbers were used to stratify the study cohort, with 80% designated as the training cohort and 20% as the validation cohort. A total of 2153 radiomics features were extracted using Python (Pyradiomics package) to analyse the CT characteristics of the largest lesion areas. Significant features were identified through least absolute shrinkage and selection operator (LASSO) regression. The following four supervised learning classifier models were developed: random forest (RF), support vector machine (SVM), logistic regression (LR), and extreme gradient boosting (XGBoost). Receiver-operating characteristic (ROC) curves and the areas under the ROC curves (AUCs) were used to assess and compare the predictive performance of these models. Results The Student's t-test, Levene test, and LASSO algorithm collectively selected 23 optimal features. In the training cohort, the AUC values of the XGBoost, LR, SVM, and RF models were 1, 0.9044, 0.8868, and 0.7982, respectively. In the validation cohort, the AUC values of the XGBoost, LR, SVM, and RF models were 0.8358, 0.8085, 0.87739, and 0.7759, respectively. The DeLong test showed no significant differences among the models. Conclusion CT radiomics features can help distinguish between NTM-PD and PTB. Among the four classifiers, SVM showed stable performance in identifying these two diseases.


Introduction
Nontuberculous mycobacterial pulmonary disease (NTM-PD) refers to a group of infections caused by mycobacteria other than the Mycobacterium tuberculosis complex, which causes tuberculosis (TB) [1][2][3]. NTM are bacteria commonly found in the environment, mainly in soil and water sources. Interestingly, while most people are exposed to NTM daily, only a small proportion of individuals develop NTM-PD. NTM-PD typically affects individuals with certain underlying conditions, such as chronic lung diseases (e.g. bronchiectasis or chronic obstructive pulmonary disease), immune system disorders, or structural abnormalities of the lungs [4,5]. However, in some cases, it can occur in otherwise healthy individuals. Symptoms of NTM-PD may include persistent or chronic cough, shortness of breath, fatigue, weight loss, and occasionally coughing up blood. These symptoms are quite similar to those of other respiratory conditions, such as pulmonary tuberculosis (PTB), making the diagnosis challenging. Currently, the main methods of distinguishing between NTM-PD and PTB are mycobacterial sputum culture and species identification. Nevertheless, these methods are resource-intensive, time-consuming, and demand advanced laboratory facilities. Furthermore, the clinical treatment strategies for NTM-PD and PTB are completely different; therefore, early and effective clinical diagnostic methods are needed. Some research has demonstrated that NTM-PD typically exhibits specific changes in computed tomography (CT) imaging [6][7][8][9]. Certain imaging features can help with identification, such as cavity formations, parenchymal lesions, tree-in-bud patterns, and bronchiectasis [9][10][11]. However, these features do not offer adequate and effective markers for discriminating between NTM-PD and PTB.
Radiomics, an emerging field in medical imaging, focuses on the extraction and analysis of quantitative data from radiographic images [12,13]. It involves the conversion of radiographic images into mineable data, thereby allowing the detection of previously hidden information and patterns. Through the use of technologies such as machine learning (ML) and artificial intelligence, radiomics aims to extract relevant features from medical images that can be correlated with patient diagnosis, treatment response, and disease prognosis. Notably, radiomics has demonstrated significant promise in the differentiation and accurate diagnosis of lung diseases, such as pulmonary nodules and early-stage lung cancer [14,15]. Consequently, it offers a potentially reliable approach for distinguishing between NTM-PD and PTB. In this study, radiomics based on extraction of the maximum lesion was considered more clinically feasible than radiomics based on extraction of representative characteristic lesions. We expect to select the best-performing ML model, using maximum lesion extraction radiomics, to produce results similar to or better than those of previous studies.

Patients and datasets
Herein, a retrospective approach was used to collect data from individuals with NTM-PD or PTB who had undergone non-contrast CT examinations at Zhejiang Chinese and Western Medicine Integrated Hospital (ZJCWMIH) between January 2018 and January 2020. This study was performed in accordance with the Declaration of Helsinki regarding ethical principles for research involving human samples. The ZJCWMIH institutional review committee granted its approval for this study (2023-YS-139), and informed consent was not required.
All participants had microbiologically confirmed NTM-PD or PTB. Pathogenic microbiological examination was employed as the diagnostic criterion for NTM-PD and PTB. Sputum samples were collected and subjected to bacterial culture or strain identification. Löwenstein-Jensen medium was used for Mycobacterium culture. NTM species were identified using matrix-assisted laser desorption ionization-time of flight mass spectrometry. More specifically, the diagnosis of NTM-PD adhered to the guidelines provided in the 'Treatment of Nontuberculous Mycobacterial Pulmonary Disease: An Official ATS/ERS/ESCMID/IDSA Clinical Practice Guideline' (2020) [4]. Additionally, the diagnosis of PTB followed the criteria stipulated by the National Health Commission of the People's Republic of China (Diagnostic criteria for pulmonary tuberculosis [WS 288-2017]) [16].
To retain the data most relevant to our research objective, we excluded individuals with both diseases, other pulmonary conditions (encompassing neoplasms, interstitial lesions, or infectious diseases), prior thoracic surgical interventions, or HIV infection, as well as those whose CT images were compromised by respiratory motion or metal artefacts.
This study included 384 patients, comprising 99 individuals with NTM-PD and 285 individuals with PTB. The individuals were randomly assigned to two cohorts (training and validation) at a ratio of 8:2. CT images were obtained within 1 month prior to sample collection for pathogenic microbiological examination, and no clinical treatment was administered during this period (Figure 1).
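The 8:2 split described above can be illustrated with a minimal sketch. It assumes the split was drawn separately within each diagnosis group (a stratified split), which the text does not state in detail. The function name `stratified_split`, the seed, and the rounding rule (floor) are assumptions made for illustration only.

```python
import random


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split sample indices into training and validation sets, class by class."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train, validation = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = int(len(indices) * train_fraction)  # assumption: round down
        train.extend(indices[:cut])
        validation.extend(indices[cut:])
    return sorted(train), sorted(validation)


# Group sizes taken from the text: 99 NTM-PD and 285 PTB patients.
labels = ["NTM-PD"] * 99 + ["PTB"] * 285
train_idx, val_idx = stratified_split(labels)
print(len(train_idx), len(val_idx))  # 307 77
```

Under these assumptions the sketch yields 79 + 228 = 307 training cases and 20 + 57 = 77 validation cases, which agrees with the cohort sizes reported in this study.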

Imaging data
CT was performed for all participants using a spiral CT scanner (BrightSpeed, GE, USA) following the same protocol. Throughout the scanning procedure, individuals were positioned supine and directed to inhale maximally. They were then asked to hold their breath during the scan to ensure data accuracy. A comprehensive scan was performed, covering both lungs from the apical regions to the basal regions. The field of view was further adjusted according to the body shape of the individual. The CT parameters, with a 512 × 512 in-plane size, were configured as follows: tube potential, 120 kV; automatic tube current, 0.75 s/r; and collimator width, 16 × 1.25 mm. These scans were reconstructed using both standard and lung algorithms (thickness, 2.5 mm; interval, 2.5 mm). The voxel size of the BrightSpeed scans ranged from 0.31 to 1.14 mm³.

Label annotation
ITK-SNAP (v 3.8.0, http://www.itksnap.org/) was used to label radiographic characteristics, such as cavities, bronchiectasis, nodules, pleural effusion, and consolidation. Two respiratory physicians with 1-3 years of experience (YZ and HY) underwent trial-and-error training to improve annotation accuracy. Following their training and the annotation of cases, label verification was carried out by a senior radiologist (WY) with a decade of experience in the field. This verification process was performed without access to clinical information to guarantee consistency. At the same time, the senior radiologist compiled a list of any errors or cautions in label annotation, giving the junior physicians an opportunity to address and rectify their mistakes.

Feature extraction
The area with the largest radiographic characteristics was selected in each patient for extraction of radiomic features [12]. The extraction process was executed via Pyradiomics (v 3.6.2), yielding 2153 original features in total [17].

Feature dimension reduction and ML approach
An 8:2 ratio was used to randomly assign the individuals under study to the training and validation cohorts. Initially, the Student's t-test and Levene test were performed for feature dimension reduction. The least absolute shrinkage and selection operator (LASSO) logistic regression algorithm, with a penalty parameter tuned via 10-fold cross-validation, was then applied in the training cohort to identify the most significant features, namely those with non-zero coefficients. Specifically, in the training cohort, the LASSO binary regression model selected 29 (minimum) or 23 (1 standard deviation) radiomic features.
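The two feature sets correspond to two choices of the LASSO penalty on the cross-validation curve. The sketch below illustrates how such a pair of penalties is commonly chosen, assuming the convention used by the 'glmnet' package, where the criteria are called lambda.min and lambda.1se (the one-standard-error rule, described above as "1 standard deviation"). The function name `select_lambdas` and all numeric values are hypothetical and are not taken from this study.

```python
def select_lambdas(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from per-penalty cross-validation summaries.

    lambdas -- candidate penalty values
    cv_mean -- mean cross-validated error for each penalty
    cv_se   -- standard error of that mean for each penalty
    """
    # lambda_min: the penalty with the lowest mean cross-validated error.
    best = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    # lambda_1se: the largest (most regularizing) penalty whose error is
    # still within one standard error of the minimum.
    threshold = cv_mean[best] + cv_se[best]
    lambda_1se = max(
        lam for lam, err in zip(lambdas, cv_mean) if err <= threshold
    )
    return lambdas[best], lambda_1se


# Hypothetical cross-validation results for five candidate penalties.
lambdas = [0.100, 0.050, 0.020, 0.010, 0.005]
cv_mean = [0.62, 0.52, 0.47, 0.45, 0.46]
cv_se = [0.03, 0.03, 0.03, 0.03, 0.03]

lambda_min, lambda_1se = select_lambdas(lambdas, cv_mean, cv_se)
print(lambda_min, lambda_1se)  # 0.01 0.02
```

A larger penalty shrinks more coefficients to zero, which is why the second criterion retains fewer features (23) than the minimum-error criterion (29).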
The radiomic features of the training cohort were used to train four ML models: support vector machine (SVM), random forest (RF), logistic regression (LR), and extreme gradient boosting (XGBoost). The easy-to-operate LR model is frequently used to investigate the effect of feature variables on a target variable and typically serves as a binary classifier [18].
As its name suggests, the RF model is an ML classifier that employs multiple trees for training and sample prediction; it is designed to reduce training variance and improve model generalization [19]. In this research, the number of trees was tuned to achieve the minimum loss rate: 240 trees for the 23 radiomic features and 680 trees for the 29 radiomic features.
SVM, another widely applied ML technique, is a kernel-based approach. It separates samples in a multi-dimensional feature space into two categories [20]. The best-performing SVM model for the 23 radiomic features was achieved with a linear kernel, a cost of 0.0312, a gamma of 0.05, and an epsilon of 0.45. Meanwhile, the best-performing SVM model for the 29 radiomic features was attained with a cost of 0.25, a gamma of 0.05, and an epsilon of 0.4.
XGBoost, a state-of-the-art ML algorithm, has been detailed by Chen et al. [21]. It had a more complex parameter configuration. The gbtree booster was selected, with min_child_weight set to 0.8, gamma to 0.8, subsample to 1, and colsample_bytree to 1. For the 23 radiomic features, the parameters chosen for optimal performance were an eta of 2.31, nrounds of 29, and max_depth of 5. The optimum parameters for the 29 radiomic features were an eta of 1.18, nrounds of 8, and max_depth of 3. All other parameters were kept at the default settings.

Performance evaluation and statistical analysis
To assess the predictive power of each model, receiver-operating characteristic (ROC) curves were generated and the areas under the ROC curves (AUCs) were computed. The AUCs of the models were compared using the DeLong test [22].
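As a reminder of what the AUC measures, the following minimal sketch computes it directly from its rank interpretation: the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as one half. This is not the 'pROC' implementation used in the study; the function name `auc` and the scores are hypothetical.

```python
def auc(scores_positive, scores_negative):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical predicted probabilities of NTM-PD for a handful of cases.
ntm_scores = [0.9, 0.8, 0.6, 0.4]       # patients who truly have NTM-PD
ptb_scores = [0.7, 0.5, 0.3, 0.2, 0.1]  # patients who truly have PTB

print(auc(ntm_scores, ptb_scores))  # 0.85
```

In this toy example 17 of the 20 positive/negative pairs are ordered correctly, giving an AUC of 0.85. An AUC of 1 means every pair is ordered correctly, and 0.5 corresponds to random ordering.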
Baseline features of the individuals were summarized using frequency tables and descriptive statistics. Additionally, proportions across the infection groups were compared using the chi-square (χ²) test. LASSO regression was used to strike a balance between overfitting and underfitting among the variables. This enabled the identification of the radiomic features that are most crucial for distinguishing between NTM and M. tuberculosis lung diseases in this research. R v 3.6.2 (https://www.r-project.org/) was used to perform statistical analyses. The analysis involved the 'corrplot', 'caret', 'readr', 'randomForest', 'glmnet', 'pROC', 'Matrix', 'rms', 'e1071', 'rmda', 'xgboost', and 'nsROC' packages. Additionally, Statistical Product and Service Solutions (SPSS) v 23.0 (IBM) was used for certain aspects of the analysis. Two-sided p values >0.05 were taken to indicate no significant difference in diagnostic performance between the predictive models.

Patient characteristics
This study encompassed 99 individuals with NTM-PD and 285 with PTB. Table 1 describes the characteristics of the participants in the training (n = 307) and validation cohorts (n = 77). At diagnosis, the median age of the participants was 52.5 years (interquartile range, 30-65 years). There were 153 (39.8%) females among the participants (training cohort, n = 125 [40.7%]; validation cohort, n = 28 [36.4%]). The distribution of the baseline features across these two cohorts exhibited no significant differences.
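The comparison of proportions between cohorts can be illustrated with the sex distribution given above. The sketch below computes a Pearson chi-square statistic for a 2 × 2 table without continuity correction, which is an assumption; the study does not state which variant was applied. The function name `chi_square_2x2` is hypothetical, and the resulting values are an illustration rather than figures reported by the authors.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    # For 1 degree of freedom, the chi-square upper-tail probability
    # equals erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Rows: training cohort, validation cohort. Columns: female, male.
# Counts from the text: 125 of 307 and 28 of 77 participants were female.
statistic, p_value = chi_square_2x2(125, 307 - 125, 28, 77 - 28)
print(round(statistic, 3), p_value > 0.05)
```

Under these assumptions the p value is well above 0.05, in line with the statement that the baseline features did not differ significantly between the two cohorts.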

Clinical radiologic diagnosis
Based on the CT images of the patients with NTM-PD, 15% were correctly diagnosed radiologically and 64% were misdiagnosed with PTB. The remaining 21% were diagnosed with pneumonia, bronchiectasis, tumours, or other diseases.

Discussion
NTM-PD refers to lung diseases caused by mycobacteria other than the M. tuberculosis complex and M. leprae. Previous studies have identified over 190 NTM types, with certain types known to be pathogenic agents [2][3][4][5]. It is worth noting that certain regions and countries have seen a rise in the incidence and prevalence of NTM-PD [23][24][25]. Presently, the only means of identifying NTM is bacterial culture and strain identification, despite the time-consuming nature of the process. Therefore, methods that enable early diagnosis and prompt treatment of NTM-PD are of utmost importance.
Nevertheless, distinguishing between NTM-PD and PTB can be challenging due to a significant overlap in symptoms and only subtle differences in CT images. Even in specialized hospitals, the accuracy of clinical radiologic diagnosis is rather low. This study explores ML to differentiate individuals with NTM-PD from those with PTB using CT images. Our findings revealed that the proposed ML model exhibits significant promise in achieving this distinction.
Previous relevant radiomic studies were based on characteristic radiographic features, such as cavity formation and bronchiectasis [26,27]. Xing et al. used the linear SVM method in a radiomics study of 59 patients with NTM-PD and 57 patients with PTB [26]. The AUCs for cavity formation and bronchiectasis were 0.70 ± 0.07 and 0.84 ± 0.06, respectively. In the study by Yan et al. [27], the AUCs of the training group exceeded 0.97, whereas those of the validation group were greater than 0.84, and those of the external validation group were also higher than 0.84. The ML algorithms have thus shown good performance in identifying these two diseases. However, the clinical incidence of characteristic radiographic features is not high, and the regional difference is quite large. Kang et al. recorded 421 cases of NTM-PD, and the incidence of cavitation was 21.9% [28]. A review by Hu et al. of 154 patients with NTM-PD in Nanjing, China, showed that the incidence of bronchiectasis was approximately 39.1% [29]. Additionally, Lou et al. reviewed 513 patients with NTM-PD in Shanghai Chest Hospital and found that the incidence rates of bronchiectasis and cavitation in patients with different NTM species were 34.5%-84.1% and 39.1%-85.7%, respectively [30]. Therefore, it seemed more practical for us to select the region with the largest characteristic radiographic features in each patient for radiomics feature extraction.
Ensuring the repeatability and reproducibility of radiomics features in related research can be quite challenging. It is worth noting that the aforementioned studies used a slice thickness of 5 mm, whereas this research used a much thinner slice of 2.5 mm [26,27]. The reconstruction slice thickness has a marked influence on radiomics features, and a thinner slice is the better option for achieving stable results [31]. Some studies revealed that LoG and Original radiomics features have good stability, whereas Wavelet radiomics features have relatively poor stability [32]. The 23 or 29 radiomics features screened out in our study showed greater stability than those reported in previous studies. Nevertheless, large-sample studies and further research are warranted to confirm these findings.
An additional crucial consideration is the selection of the most suitable model. Yan et al. considered the LR algorithm model to be better due to its high precision, recall, and F1 score [27]. Although the XGBoost model outperformed the rest of the models in the training cohort in our study, its performance in the validation cohort was quite ordinary. This could be attributed to overfitting of the XGBoost model. We agree with Xing et al. that linear SVM may be a better algorithm choice based on its balanced performance in the training and validation cohorts [26]. Deep-learning models have broader application prospects. However, according to the results of Wang Li et al. and Ying Chiqing et al., when judged by AUC alone, deep-learning models were no better than the ML models [33,34]. Therefore, ML algorithms have a role to play, especially when the sample size is relatively small.
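The notion of balanced performance can be made concrete by comparing each model's training AUC with its validation AUC. The sketch below uses the AUC values reported in this study for the 23-feature models; treating the size of the training-validation gap as an indicator of overfitting is a simplifying assumption for illustration, not a formal test.

```python
# (training AUC, validation AUC) for each model, as reported in this study.
auc_by_model = {
    "XGBoost": (1.0, 0.8358),
    "LR": (0.9044, 0.8085),
    "SVM": (0.8868, 0.87739),
    "RF": (0.7982, 0.7759),
}

# Drop in AUC from the training cohort to the validation cohort.
gaps = {
    model: round(train_auc - val_auc, 4)
    for model, (train_auc, val_auc) in auc_by_model.items()
}

most_stable = min(gaps, key=gaps.get)
largest_gap = max(gaps, key=gaps.get)

for model, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{model}: {gap:.4f}")
print("smallest gap:", most_stable)  # SVM
print("largest gap:", largest_gap)   # XGBoost
```

By this simple measure, SVM shows the smallest drop between cohorts and XGBoost the largest, which is consistent with the interpretation given above.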

Limitations and future improvement
This study is limited in certain respects. First, the sample size was relatively small, and we were unable to acquire a sufficient number of NTM-PD cases. Second, the study data were from a single centre and lacked true external validation. Third, the research relied solely on CT images, without considering clinical characteristics, thereby potentially restricting the predictive capability of the ML algorithms. Fourth, the underlying theories of various ML algorithms can be complex and challenging to comprehend. To make these models more accessible to clinicians, they should not only be comprehensible but also be able to estimate their own uncertainties [35].

Conclusion
In this research, 23 and 29 radiomics features were selected based on the LASSO regression model. No significant difference was observed between the models based on the 23 and 29 radiomics features. Four predictive models were developed to examine the accuracy of distinguishing between NTM-PD and PTB. Considering previous studies and our research, the SVM model may have a more stable prediction accuracy than the other algorithms according to the AUC analyses. The radiomics features derived from the CT images proved to be highly effective in distinguishing between NTM-PD and PTB. Notably, the radiomics analysis yielded a more accurate diagnosis compared to assessments made by radiologists.

Figure 1. The study flow chart.
The models were assessed via ROC curves and their corresponding AUCs to determine their predictive accuracy. These values were computed for both the training (n = 307) and validation cohorts (n = 77). The DeLong test indicated no significant difference between the 23 and 29 radiomics features (training cohort, p = 0.233; validation cohort, p = 0.5298).

Figure 2. The results of the LASSO regression. The LASSO binary regression model was employed for the selection of 29 (minimum) or 23 (1 standard deviation) radiomics features. LASSO: least absolute shrinkage and selection operator.

Figure 3. The importance of the selected features was examined through the LASSO regression model. (a) Importance analysis of the 23 features based on LASSO regression. (b) Importance analysis of the 29 features based on LASSO regression.
Yan et al. used six classifiers (K-nearest neighbours [KNN], SVM, XGBoost, RF, LR, and decision tree [DT]) in a radiomics study of lung cavities in 73 NTM-PD and 69 PTB patients, with external validation in 20 NTM-PD and 20 PTB patients [27].

Figure 4. Receiver-operating characteristic curves and AUCs demonstrating the predictions of the four models: LR, RF, SVM, and XGBoost. (a) The training cohort and (b) the validation cohort. AUC: area under the curve; LR: logistic regression; RF: random forest; SVM: support vector machine; XGBoost: extreme gradient boosting.

Table 1. Baseline characteristics of patients with NTM-PD/PTB.